Speaker Recognition System Based On MFCC and DCT
Abstract
This paper presents an approach to the recognition of speech signals using spectral information on the mel frequency scale, which is a dominant feature for speech recognition. Mel-frequency cepstral coefficients (MFCCs) are coefficients that collectively represent the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear mel scale of frequency. The performance of MFCC is affected by the number of filters, the shape of the filters, the way the filters are spaced, and the way the power spectrum is warped. In this paper, the optimum values of these parameters are chosen to achieve an efficiency of 99.5% on a very short audio file.
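The four parameters named in the abstract (number of filters, filter shape, filter spacing, and frequency warping) can be illustrated with a minimal sketch of the MFCC pipeline. The abstract does not give the paper's actual settings, so everything below is an assumption: the warping uses the common formula mel = 2595 log10(1 + f/700), the filters are triangular and equally spaced in mel, and the function names (`hz_to_mel`, `mel_filterbank`, `power_spectrum`, `dct2`, `mfcc_from_power_spectrum`) and the numeric values are hypothetical choices for illustration only, not the optimum values reported by the authors.

```python
import math

def hz_to_mel(f_hz):
    # Assumed warping; the paper's exact warping function is not stated.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    # Inverse of the assumed warping above.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters (assumed shape) spaced uniformly on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, equally spaced in mel, as fractional FFT bins.
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) * n_fft / sample_rate
             for i in range(n_filters + 2)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_bins):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def power_spectrum(frame):
    """Power spectrum of one Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    x = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
         for i, s in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x[i] * math.cos(2.0 * math.pi * k * i / n) for i in range(n))
        im = -sum(x[i] * math.sin(2.0 * math.pi * k * i / n) for i in range(n))
        spec.append((re * re + im * im) / n)
    return spec

def dct2(values, n_out):
    """Unnormalized DCT-II, keeping the first n_out coefficients."""
    n = len(values)
    return [sum(values[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
            for k in range(n_out)]

def mfcc_from_power_spectrum(power, bank, n_ceps):
    # Filter bank energies -> log -> cosine transform.
    energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
    log_energies = [math.log(e + 1e-10) for e in energies]  # small floor avoids log(0)
    return dct2(log_energies, n_ceps)

if __name__ == "__main__":
    sample_rate, n_fft = 8000, 256          # illustrative values only
    frame = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate) for t in range(n_fft)]
    bank = mel_filterbank(n_filters=20, n_fft=n_fft, sample_rate=sample_rate)
    ceps = mfcc_from_power_spectrum(power_spectrum(frame), bank, n_ceps=13)
    print(len(ceps), "coefficients for one frame")
```

Changing `n_filters`, the slope expressions in `mel_filterbank`, or the `hz_to_mel` formula corresponds to varying the filter count, filter shape, and warping discussed in the abstract; a practical system would use an FFT rather than the naive DFT shown here.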
Similar resources
Neural "spike rate spectrum" as a noise robust, speaker invariant feature for automatic speech recognition
A new feature set for ASR called Rate-Spectrum (RS) is proposed. RS is a spectral representation obtained using a computational auditory model. The feature is noise-robust and considerably speaker-invariant. RS matches the smoothed log spectrum in both shape and dynamic range variation. DCT is used to reduce dimensionality. Comparison of the proposed features with MFCC is done using an Isolated ...
Automatic Speaker Recognition Using Fuzzy Vector Quantization
Speaker recognition (SR) is a dynamic biometric task. SR is a multidisciplinary problem that encompasses many aspects of human speech, including speech recognition, language recognition, and speech accents. This technique makes it possible to use the speaker’s voice to verify his/her identity and provide controlled access to services. The Mel-frequency extraction method is a leading approach for sp...
Recognition of Persian Accents from the Speech Signal Using Efficient Feature Extraction Methods and Classifier Combination
Speech recognition has improved greatly in recent years. However, robustness is still one of the major problems: recognition performance fluctuates sharply depending on the speaker, especially when the speaker has a strong accent. Different accents dramatically decrease the accuracy of an ASR system. In this paper we apply three new methods of feature extraction including Spectral C...
Design, analysis and experimental evaluation of block based transformation in MFCC computation for speaker recognition
The standard Mel frequency cepstrum coefficient (MFCC) computation technique uses the discrete cosine transform (DCT) to decorrelate the log energies of the filter bank output. The use of DCT is reasonable here because the covariance matrix of the Mel filter bank log energy (MFLE) can be compared with that of a highly correlated Markov-I process. This full-band based MFCC computation technique where each of the fi...
Multi-Biometric Person Authentication System Using Speech, Signature And Handwriting Features
Biometric technologies are automated methods for verifying or recognizing the identity of a living person based on physiological or behavioral characteristics. Multimodal biometric systems are those which utilize more than one physiological or behavioral characteristic for...